Multiple sclerosis (MS) is a complex disease with underlying genetic and environmental factors. Although
alleles within the major histocompatibility complex (MHC) are known to exert strong
effects on MS risk, much remains to be learned about the contributions of loci with more modest effects
identified by genome-wide association studies (GWASs), as well as loci that remain undiscovered. We use a
recently developed method to estimate the proportion of variance in disease liability explained by 475,806
single nucleotide polymorphisms (SNPs) genotyped in 1,854 MS cases and 5,164 controls. We find that
~30% of MS genetic liability is explained by SNPs in this dataset, the majority of which is accounted for by
common variants. These results suggest that the unaccounted-for proportion could be explained by variants
that are in imperfect linkage disequilibrium with common GWAS SNPs, highlighting the potential
importance of rare variants in the susceptibility to MS.

Multiple sclerosis (MS) is an inflammatory disease of the central nervous system, and is the most common
neurological disorder affecting young adults 1 . Current evidence implicates both environmental
and genetic factors in the onset and progression of the disease 2-4 . The importance of genetic factors in
MS was recognized early in the study of the disease, and is best illustrated by observations of strong familial
clustering and a significantly increased risk in first-degree relatives 5-7 . Further support for the role of genes in MS
comes from studies of monozygotic and dizygotic twins, which also indicate a strong genetic component;
however, heritability estimates from these studies range from roughly 25% to 75% 8-11 . Alleles of the major
histocompatibility complex (MHC) are so far known to make the single strongest contribution to MS
susceptibility 12 . In addition, many loci of more modest effect have recently been identified in genome-wide
association studies (GWASs) 13-16 . While risk alleles at the MHC are thought to represent a significant proportion of
MS genetic susceptibility 13 , the contribution of variants outside of the MHC, specifically those represented by
single nucleotide polymorphisms (SNPs) genotyped by GWASs, has not been extensively explored. To investigate
in more detail the role of common GWAS variants in MS susceptibility, we used publicly available genotype data
from the United Kingdom (UK) MS patient and control cohorts 16 and a recently described approach that assesses
contributions made by all genotyped SNPs, rather than solely risk loci that reach genome-wide significance 17-20 .
From this analysis we show that approximately 30% of the genetic variation in liability to MS is directly explained
by variants represented by current GWAS arrays.

Results 

For this study, we used genome-wide genotype data for 475,806 autosomal SNPs collected from 1,854 MS cases 
and 5,164 controls sampled from the UK 16 . After assessing the relatedness between individuals, and thus 
accounting for effects of population structure, we first estimated the proportion of variance explained by all 
autosomal SNPs simultaneously. This analysis revealed that 30.7% (standard error (SE) = 2.05%) of the variance 
in liability to MS is accounted for by SNPs in this dataset. 
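As a rough guide to the precision of this estimate, an approximate 95% confidence interval can be derived from the reported standard error. The sketch below assumes a normal approximation (estimate ± 1.96 × SE); that assumption, and the function name `approx_ci`, are introduced here for illustration and are not part of the reported analysis.

```python
def approx_ci(estimate, se, z=1.96):
    """Return (lower, upper) bounds of a normal-approximation confidence interval."""
    return estimate - z * se, estimate + z * se


# Reported values: 30.7% of the variance in liability, SE = 2.05%
lower, upper = approx_ci(30.7, 2.05)
print(f"approximate 95% CI: {lower:.1f}% to {upper:.1f}%")
```

Under this approximation the interval spans roughly 27% to 35%, consistent with summarizing the estimate as approximately 30%.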

We next partitioned SNPs by autosome and recalculated the proportion of variance explained by variants
found on each chromosome (Table 1); estimated values ranged from ~0-8% per chromosome. Not surprisingly,
given the known contribution of the MHC, which is located on chromosome 6, SNPs on this chromosome
account for 8.11% of the variance (SE = 0.72%). By calculating the proportion of the genome represented by
each chromosome (not including the length of sex chromosomes), we tested for a correlation between the
variance explained by each chromosome and its size, excluding chromosome 6 (Figure 1). Although several
of the smaller chromosomes contributed less to the overall variance than several of the larger chromosomes,
the overall trend was not significant (r = 0.336, P = 0.136). To assess the contribution made by common
versus rare variants, we also binned SNPs based on minor allele frequency (MAF; Figure 2). From this, we
observed that common variants (MAF > 0.1; ~4-6%), which are most abundantly sampled on GWAS arrays,
make a greater contribution than rare variants (MAF < 0.1; ~2.8%). However, because of the unequal
number of SNPs in each bin, we also binned SNPs by quintile (Figure 3). Based on this analysis, we found
that all quintiles explained an equivalent proportion of variance, indicating that no particular MAF range
makes a larger or smaller contribution to MS, and that variants of all frequencies should be captured and tested.
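The significance of the chromosome-length correlation can be checked from the reported r alone. The sketch below uses the standard t test for a Pearson correlation, t = r√(n − 2)/√(1 − r²), and assumes n = 21 data points (22 autosomes minus chromosome 6); that count is inferred from the text rather than stated explicitly. The helper names are illustrative, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10_000):
    """Two-sided P value; integrates the density over [0, |t|] with Simpson's rule."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half


def correlation_test(r, n):
    """t statistic and two-sided P value for a Pearson correlation r over n points."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, two_sided_p(t, df)


# Assumption: n = 21 autosomes (chromosome 6 excluded)
t_stat, p_value = correlation_test(0.336, 21)
print(f"t = {t_stat:.3f}, P = {p_value:.3f}")
```

With these inputs the P value should land close to the reported 0.136, which supports reading the trend as positive but not statistically significant.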

Lastly, we carried out an association analysis using only the UK 
GWAS data. We identified 15 associated autosomal SNPs in this 
cohort outside of the MHC with P values <1 × 10⁻⁵. These SNPs,
their positions (hg18; NCBI Build 36.1), and the nearest RefSeq gene
to each are listed in Table 2. Using association analysis data, we also 
examined the contribution made by all associated SNPs to the 
observed variance after binning by P value, including those SNPs 
within the MHC (Table 3). 

Discussion 

Using available data from a large UK case-control cohort 16 , we have
conducted a comprehensive assessment of the contribution of
genome-wide SNPs to the variance in liability to MS. The power of the
approach used here is that contributions of genotypes at all available
loci across the genome (in this case, 475,806), rather than only a set of
identified MS risk loci, can be accounted for. Thus,
from our analysis, we conclude that approximately 30% of MS
heritability is explained by variants on current GWAS arrays, including
SNPs on chromosome 6, which alone account for ~8% and reflect
the major contribution of the MHC. The role of the MHC in MS has
long been known; specifically, HLA-DRB1*1501 confers a 2-fold
increase in risk 13 . However, the underlying genetic architecture of
MS is presumed to be polygenic, involving a large number of loci with
smaller effects 22,23 . Our findings lend support to this notion, as we
observed that the genetic contributions of SNPs on autosomes other
than chromosome 6 were at least in part correlated with autosome
length. However, this relationship was not significant, and not as
convincing as that illustrated previously for other polygenic
disorders 17,21 . This might hint at the possibility that some unidentified
MS risk loci have slightly larger effects than others, which has been
discussed recently 23 . Additionally, our study was smaller than that
of Yang et al. 17 and Lee et al. 21 , and thus would be comparatively
underpowered.

[Figure 1 plot. x-axis: Chromosome length (% of total length of all autosomes). Fitted line: y = 0.1041x + 0.0044, R² = 0.11308]

Figure 1 | Contribution of GWAS SNPs and chromosome length. The proportion of variance in MS liability explained by SNPs partitioned by autosome
(based on data from Table 1, excluding chr 6) relative to chromosome size, which was determined by dividing the length of each autosome by the sum of
the lengths of all autosomes.

SCIENTIFIC REPORTS | 2 : 770 | DOI: 10.1038/srep00770

[Figure 2 plot. x-axis: MAF, in bins 0.0-0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5]

Figure 2 | Contribution of GWAS SNPs partitioned by minor allele
frequency. The total proportion of variance explained and standard errors
for SNPs in each of five MAF bins. The number of SNPs included in each
bin varied (0.0-0.1, n = 76046; 0.1-0.2, n = 112435; 0.2-0.3,
n = 97482; 0.3-0.4, n = 89704; 0.4-0.5, n = 86625).

Figure 3 | Contribution of GWAS SNPs partitioned by quintile. The total
proportion of variance explained and standard errors for all SNPs tested
after binning by quintile. The numbers of SNPs included in each quintile are
as follows: 0.0-0.11, n = 93079; 0.11-0.19, n = 93074; 0.19-0.28,
n = 93076; 0.28-0.39, n = 93089; 0.39-0.5, n = 93116.

Table 2 | Top SNPs from association analysis using UK GWAS

Also notable, we observed that the majority of variation represented
by GWAS SNPs was explained by common variants with
MAFs over 0.1, perhaps not surprisingly given that these outnumbered
rare variants. This highlights both the utility of GWAS arrays,
which have placed much emphasis on the inclusion of common
SNPs, and the fact that the use of larger sample sizes in GWAS should
increase power and yield discoveries of additional risk loci, a point
that has recently been noted in the context of schizophrenia 21 .
Importantly though, this observation does not rule out a potentially
significant role of rare variants in MS. For example, rare variants in
CYP27B1, a gene essential to vitamin D synthesis, have been reported
at low frequencies in MS patients, but not in controls (odds ratio =
4.7) 24 . Rare variants in the TYK2 gene have also more recently been
shown to influence MS risk 25 . Furthermore, we found that even after
including the effects of over 400,000 SNPs in this cohort, most of the
variance in MS liability remains unaccounted for. As has been discussed
previously in the context of the "missing heritability" of complex
diseases, one of the more likely explanations for this is that
GWAS SNPs are in imperfect linkage disequilibrium (LD) with
disease-causing variants 26 . Again, this points to the possible importance
of rare variants, as allele frequency differences between causative
alleles and genotyped SNPs impact LD, and may also implicate a
potential role for structural variants (e.g., large deletions or duplications),
which are also only partially represented by neighboring
SNPs, especially those that are multi-allelic and in regions of the
genome characterized by segmental duplication 27 . Imputation-based
methods to increase the number of common variants tested can also
be applied to datasets such as the one used here, but it has recently
been observed in schizophrenia that the application of imputation
methods only yielded an approximate 2% increase in heritability
estimates 21 .
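The effect of allele frequency differences on LD can be made concrete with a small calculation. For two biallelic loci with allele frequencies p_a and p_b, r² = D²/(p_a(1 − p_a)p_b(1 − p_b)), and the frequencies themselves limit how large |D| can be. The sketch below computes that upper bound on r²; the function name and the example frequencies are hypothetical and are not taken from the dataset.

```python
def max_r2(p_a, p_b):
    """Upper bound on r^2 between two biallelic loci with allele frequencies p_a, p_b."""
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # Largest positive and largest negative |D| compatible with the allele frequencies
    d_pos = min(p_a * (1 - p_b), p_b * (1 - p_a))
    d_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return max(d_pos, d_neg) ** 2 / denom


# Hypothetical causal-variant frequencies tagged by a SNP with MAF = 0.2
for causal_freq in (0.2, 0.05, 0.01):
    print(f"causal freq {causal_freq}: max r^2 = {max_r2(causal_freq, 0.2):.3f}")
```

In this illustration, a causal allele at 1% frequency can reach an r² of only about 0.04 with a SNP at 20% frequency, so even a perfectly positioned common SNP would capture little of the variance contributed by such a variant.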

In conclusion, we estimate that approximately 30% of genetic 
variation in liability to MS is captured by considering all genotyped 
SNPs simultaneously. The remaining missing heritability most likely 
reflects imperfect LD between causal variants and the genotyped 
SNPs. 

Methods 

Genotypes for UK MS cases and controls were obtained from GWAS data recently
generated by the International Multiple Sclerosis Genetics Consortium and the
Wellcome Trust Case Control Consortium 2 16 . Estimates of the proportion of
variance explained were calculated using the Genome-wide Complex Trait Analysis
(GCTA) tool (http://gump.qimr.edu.au/gcta/) 17-21,28 . Genetic relatedness between
individuals was assessed by principal component analysis using the GCTA tool; for
this step, the threshold used to identify and remove related individuals was set to a
pairwise genetic relationship value of >0.025 (no individuals met this criterion). The
top 20 eigenvectors from this analysis were then used as covariates in a restricted
maximum likelihood analysis, again conducted within the GCTA tool; this was used
to estimate the proportion of the variance explained by SNPs at the genome-wide
level, and after partitioning SNP data by autosomes, MAFs, and quintiles. Assembly
statistics for GRCh37 (hg19) were used to calculate autosome lengths (autosome
length/total length of all autosomes). Association analysis of GWAS SNPs was
conducted using PLINK (http://pngu.mgh.harvard.edu/purcell/plink/) 29 .
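The quintile partitioning mentioned above can be sketched as follows. The text does not describe how the quintiles were constructed, so this example assumes equal-count bins formed by sorting SNPs on MAF; the function name and the MAF values are hypothetical, for illustration only.

```python
def quintile_bins(mafs):
    """Split SNP indices into five near-equal-sized bins ordered by MAF."""
    order = sorted(range(len(mafs)), key=lambda i: mafs[i])
    n = len(order)
    edges = [round(k * n / 5) for k in range(6)]
    return [order[edges[k]:edges[k + 1]] for k in range(5)]


# Hypothetical MAFs for ten SNPs (illustrative values only)
mafs = [0.03, 0.45, 0.12, 0.27, 0.08, 0.33, 0.19, 0.41, 0.22, 0.15]
for k, idx in enumerate(quintile_bins(mafs), start=1):
    values = [mafs[i] for i in idx]
    print(f"quintile {k}: n = {len(idx)}, MAF {min(values)}-{max(values)}")
```

Unlike fixed-width MAF bins, equal-count bins place roughly the same number of SNPs in each group, so differences in variance explained between bins are not driven by differences in SNP count.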